Massive amounts of information are transmitted through complex network 
connections around the world. Providing a protected information system has 
therefore received the full consideration of many private and governmental 
institutes in order to stop attackers. Attackers block users from accessing a 
particular network service by sending a large amount of fake traffic. This 
article therefore demonstrates two classification models for an accurate 
intrusion detection system (IDS). The first model is an artificial neural 
network (ANN) in the form of a multilayer perceptron (MLP) with one hidden 
layer (MLP1), aimed at distributed denial of service (DDoS) detection. The 
MLP1 has 38 input nodes, 11 hidden nodes, and 5 output nodes. The MLP1 model 
is trained with the NSL-KDD dataset, using 38 features and five types of 
requests. The MLP1 achieves a detection accuracy of 95.6%. The second model, 
MLP2, has two hidden layers. With the same setup, the improved MLP2 model 
achieves an accuracy 2.2 percentage points higher than the MLP1 model. The 
study shows that the MLP2 model provides high classification accuracy for 
different request types. 

Keywords: Intrusion detection system, Multilayer perceptron 
1. INTRODUCTION 

Hackers attack cyber systems every single day. Information security companies and governments of 
various countries have devoted significant attention to preventing distributed denial of service 
(DDoS) attacks [1-2]. DDoS is a conventional cyber threat on the web. Sending huge numbers of 
packets to web servers from an attacker's tools is called a denial of service (DoS) attack. DoS 
attacks involve the network, transport, and application layers through the ICMP, TCP, and HTTP 
protocols, respectively [3-4]. Figure 1 illustrates a basic topology of the DDoS attack, in which 
the attacker controls a large number of servers that send packets to the victim. The attack 
attempts to block legitimate users from accessing a particular network service by continuously 
sending a large amount of fake traffic to the victim network [5]. Hackers use botnets to achieve 
the aim of DDoS attacks. A botnet is a network of host computers managed by attackers called 
botmasters [6]. The attackers send a large number of requests to a system within a short time, 
which makes that system hang. Thus, a DDoS attack takes a shorter time than a DoS attack. 
Therefore, an accurate and fast intrusion detection system (IDS) is strongly required. 
Figure 1. The architecture of DDoS attack 


The mechanism of the IDS is based on classification at the host or network level. The precision 
with which normal and anomalous requests are classified determines the IDS accuracy. The 
classification system can recognize a DDoS attack from the behavior of the traffic. Many 
classification techniques have been used to improve IDS accuracy, and machine-learning 
algorithms have been widely utilized for IDSs. Various artificial intelligence detection methods, 
namely Bayesian networks [7], fuzzy logic [8], genetic algorithms [9], clustering, multilayer 
perceptron (MLP) [10], artificial neural networks (ANN) [11], software agent technology, and 
support vector machines (SVM) [12], have been implemented for IDS applications [2]. 

The ANN is one of the techniques implemented to prevent DDoS attacks. This section surveys the 
articles that used ANNs for IDS. An ANN classification-based IDS was introduced in [13]. The ANN 
model used the NSL-KDD dataset with 29 features and achieved an accuracy of 81.2%. Then, Mane 
proposed an ANN to detect attack traffic that can be mistaken for normal traffic [14]. Only 20000 
samples of the KDD99 dataset and 17 of its 41 features were used to evaluate the performance of 
the proposed model. The system achieved an accuracy of 98%; selecting effective features for the 
ANN model improved the system accuracy. Tsai [15] proposed a time-delay neural network that can 
be used as an early detection system for DDoS attacks. This method is based on the time parameter 
of each request. The system was tested on a manually generated dataset and reached an accuracy of 
82%. In addition, it can detect only a few types of attacks. 

On the other hand, the MLP has been developed in several studies for IDS applications. The MLP 
can include multiple hidden layers to improve the learning capability of the neural network. 
Singh developed an MLP with a genetic algorithm to improve the detection efficiency of the IDS 
model [16]. The literature shows different datasets that have been used to evaluate IDS systems, 
such as CAIDA, DARPA, FIFA World Cup, and NSL-KDD [2]. Selecting the effective features of the 
CAIDA 2007 dataset with a genetic algorithm improved the system performance in terms of DDoS 
classification. Wang used the MLP to overcome the DDoS attack problem [17], combining the MLP 
with dynamic feature selection. During the training process, the model chose the optimal features 
according to feedback from the errors of each epoch [18]. Therefore, an analytical investigation 
is recommended to highlight the effective detection methods based on neural network algorithms. 

This article provides two MLP models of ANN for classification-based IDS. The MLP-based IDS is 
designed with one hidden layer (MLP1) and with two hidden layers (MLP2) to improve classification 
accuracy. The article is organized as follows. Section 2 addresses the methods and materials, 
covering the dataset, the IDS framework, the MLP1 model, and the MLP2 model. After that, the 
simulation of the two proposed models and the evaluation of the results are addressed in 
Section 3. Finally, Section 4 concludes the study's achievements. 


2. METHODS AND MATERIALS 
In this section, a brief background on the tested dataset and its selected features is given. 
Moreover, the investigated ANN-based classification methods for IDS are addressed. 
2.1. Testing dataset 

Among the different types of datasets, the NSL-KDD dataset is selected because its variety of 
features makes it suitable for evaluating the performance of our proposed models. The NSL-KDD 
dataset available in [19-21] was utilized to evaluate the ANN models in this study. The NSL-KDD 
dataset is the updated version of the KDD99 dataset [22-24]. The NSL-KDD dataset records 125973 
practical requests to a website in 2013. These requests were monitored and recorded with several 
features. The NSL-KDD dataset includes 42 features [2]. During pre-processing, some of these 
features were excluded to meet the application requirements. The request type feature was 
extracted and set as the target for training the IDS model. The NSL-KDD dataset includes five 
types of requests: 45927 DoS, 52 Probe, 995 R2L, 11656 U2R, and 67343 Normal, as displayed in 
Figure 2. 


Figure 2. The request types of the NSL-KDD dataset 


Some of the unwanted features, such as Flg, TCP and HTTP, were removed from the dataset. 
The remaining 38 features of the selected dataset were therefore imported as inputs of the IDS 
model. Specifically, 88181 samples of the dataset are used for training while 37792 samples are 
used for testing and validation. 
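
The split sizes follow directly from the request counts. The short Python sketch below (Python is used here only for illustration; it is not the MATLAB implementation of the study) sums the five request counts reported above and applies a 70/30 split. The variable names are assumptions introduced for this example.

```python
# Request counts per type as reported for the NSL-KDD dataset in this section.
counts = {"DoS": 45927, "Probe": 52, "R2L": 995, "U2R": 11656, "Normal": 67343}

total = sum(counts.values())

# 70% of the requests for training; integer arithmetic avoids rounding surprises.
n_train = total * 70 // 100
n_test = total - n_train

# Share of each request type, in percent of all requests.
shares = {name: 100.0 * n / total for name, n in counts.items()}

print(total, n_train, n_test)
for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```

The per-type shares make the class imbalance visible: the Probe type accounts for well under 0.1% of all requests.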


2.2. The IDS framework 

The ANN is modelled with one hidden layer (MLP1) and with two hidden layers (MLP2) to improve 
classification accuracy. Pre-processing selects the input and target features from the NSL-KDD 
dataset. In addition, 70% of the requests are imported to the ANN for the training process while 
the other 30% of the requests are used for testing. Training is run with several stopping 
criteria: the model is trained on the NSL-KDD dataset for at most 100 epochs, and a mean square 
error of 10^-6 is also set to stop the training. After that, the biases and weights of the ANN 
are saved for the testing process. The proposed ANN for the IDS is implemented by using MATLAB 
software. The IDS framework was designed to classify the request type according to the selected 
features, as shown in Figure 3. 
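
The two stopping criteria can be sketched as a simple training loop. This is a minimal Python illustration rather than the MATLAB code used in the study; `run_epoch` is a hypothetical callable that performs one pass over the training data and returns the resulting mean square error.

```python
MAX_EPOCHS = 100   # epoch limit stated in the text
MSE_GOAL = 1e-6    # mean square error goal stated in the text


def mean_square_error(outputs, targets):
    """Average squared difference over all samples and output nodes."""
    total, count = 0.0, 0
    for out_row, tgt_row in zip(outputs, targets):
        for o, t in zip(out_row, tgt_row):
            total += (o - t) ** 2
            count += 1
    return total / count


def train(run_epoch, max_epochs=MAX_EPOCHS, mse_goal=MSE_GOAL):
    """Call run_epoch() until the MSE goal or the epoch limit is reached.

    run_epoch is a hypothetical callable (an assumption of this sketch).
    Returns (epochs_run, last_mse).
    """
    mse = float("inf")
    for epoch in range(1, max_epochs + 1):
        mse = run_epoch()
        if mse <= mse_goal:
            break
    return epoch, mse
```

Whichever criterion is met first ends the training, so a model that never reaches the error goal still stops after 100 epochs.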


Figure 3. The IDS framework 
2.3. The MLP1 model 

The ANN has been developed as a computer model of a multilayer perceptron with one hidden 
layer (MLP1). The MLP1 architecture is built on the model of the biological neuron, in which the 
information flowing through the neurons changes what the model learns according to its structure. 
The output of the MLP1 is used to check the error and update the model weights. The simple MLP1 
configuration has three layers: an input layer, a hidden layer and an output layer. The 
information flows through the linkages among the layers. The output layer values are computed by 
multiplying the input information by the weights of the linkages and applying the activation 
function. The error is calculated by comparing the output of the MLP1 with the data target. This 
error is used to update the MLP1 weights by backpropagation. Increasing the number of learning 
iterations moves the weights toward their best values and reduces the error. 
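
The error-driven weight update can be illustrated on a single sigmoid neuron. The Python sketch below is an illustration only, not the MATLAB training routine of the study; the squared-error loss, the learning rate `lr`, and the function names are assumptions made for this example.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def update_neuron(weights, bias, inputs, target, lr=0.5):
    """One gradient-descent step for a single sigmoid neuron.

    Uses the squared error E = 0.5 * (out - target)**2 (an assumption).
    Returns the updated weights, the updated bias, and the error before the update.
    """
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    out = sigmoid(z)
    error = out - target
    # Chain rule: dE/dz = (out - target) * sigmoid'(z), with sigmoid'(z) = out * (1 - out).
    delta = error * out * (1.0 - out)
    new_weights = [w - lr * delta * x for w, x in zip(weights, inputs)]
    new_bias = bias - lr * delta
    return new_weights, new_bias, 0.5 * error ** 2


# Repeating the update on one sample drives the error down.
weights, bias = [0.0, 0.0], 0.0
errors = []
for _ in range(50):
    weights, bias, err = update_neuron(weights, bias, [1.0, 0.5], target=1.0)
    errors.append(err)
print(errors[0], errors[-1])
```

In a full MLP the same rule is applied layer by layer, with each hidden node receiving its share of the output error through the connecting weights.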

To design the IDS, the MLP1 is implemented as an accurate classification system. The proposed 
MLP1 is developed and simulated by using MATLAB. The proposed MLP1 for IDS is designed with three 
layers, as displayed in Figure 4(a). The selected 38 features of the NSL-KDD data are organized 
and used as inputs for the MLP1. Therefore, the input layer includes 38 input nodes, one node for 
each input feature. The dataset includes a large number of requests. Each request has 38 features 
that are imported to the designed MLP1 as input, with every single feature of a request imported 
to one input node. The features of all the requests in the dataset are used in turn to train the 
MLP1. Moreover, a single hidden layer is configured with 11 nodes. Each node of the hidden layer 
has a single bias, and a sigmoid activation function is used to activate the neurons. The sigmoid 
bounds the node values between 0 and 1. The hidden layer is connected with the output layer of 
the ANN. Since the requests of the dataset are labelled with five different types, the output 
layer is configured with 5 nodes. Therefore, the proposed MLP1 for IDS can classify requests into 
five types, as shown in Figure 4(a). Each node of the hidden layer is connected to all five output 
nodes. More details about the implementation are addressed in the following sections. 
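
A forward pass through a 38-11-5 network of this kind can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MATLAB model of the study: the random weight initialisation, the sigmoid on the output layer, and all function names are choices made for this example.

```python
import math
import random

N_IN, N_HID, N_OUT = 38, 11, 5  # layer sizes stated in the text


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (an assumed initialisation)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def layer_forward(inputs, weights, biases):
    """Weighted sum plus bias for each node, followed by the sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def mlp1_forward(features, hidden, output):
    hidden_values = layer_forward(features, *hidden)
    return layer_forward(hidden_values, *output)


rng = random.Random(0)
hidden = init_layer(N_IN, N_HID, rng)
output = init_layer(N_HID, N_OUT, rng)

# One request with 38 features already scaled to [0, 1] (synthetic values).
request = [rng.random() for _ in range(N_IN)]
scores = mlp1_forward(request, hidden, output)

# The output node with the largest value gives the predicted request type.
predicted = max(range(N_OUT), key=scores.__getitem__)
print(scores, predicted)
```

With untrained weights the prediction is arbitrary; training adjusts the weights and biases so that the largest output corresponds to the labelled request type.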


2.4. The MLP2 model 

Generally, the MLP2 has the same structure as the MLP1 except that multiple hidden layers are 
connected in series. The MLP2 includes a large number of inner connections between neurons to 
solve a problem. Several hidden layers are linked between the input layer and the output layer, 
as displayed in Figure 4(b). The input of the next layer is generated by applying an activation 
function to the output of the previous layer. The sigmoid activation function has been widely 
used in forward-propagation neural network training. The error is obtained by comparing the 
network output with the data target, and back-propagation adjusts the MLP2 weight vector. 
Applying this process to all the input vectors is called a training epoch, which is repeated for 
several epochs until the stopping criteria are met. 
Figure 4. The MLP topology 
The MLP2 is designed in this section to classify the NSL-KDD dataset for the IDS application. 
The proposed MLP is designed with four layers (1 input, 2 hidden, and 1 output layer). The MLP2 
layers have the same number of nodes as in the MLP1 model: the input layer has 38 nodes, each 
hidden layer has 11 nodes, and the output layer has 5 nodes. Figure 4 illustrates the MLP 
architecture with its four connected layers. The neurons of adjacent layers are connected in a 
feed-forward arrangement and trained with back-propagation. In addition, the sigmoid activation 
function is used, with the input feature values normalized between 0 and 1. The same 
pre-processing used for the MLP1 model is also applied for the MLP2 model. Moreover, the same 
stopping criteria (100 epochs and a mean square error of 10^-6) are used for the proposed MLP2 
model. The request type of the tested dataset is set as the target while the selected 38 features 
are imported as inputs of the MLP IDS model. The model is implemented by using MATLAB and trained 
for 100 epochs. The weights and biases of the MLP2 are recorded for the evaluation process. 
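
The effect of the extra hidden layer on model size can be worked out from the layer sizes alone. The Python sketch below assumes fully connected layers with one bias per non-input node; the helper name `count_parameters` is introduced only for this example.

```python
def count_parameters(layer_sizes):
    """Weights plus one bias per non-input node of a fully connected MLP."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


MLP1_LAYERS = [38, 11, 5]       # input, hidden, output
MLP2_LAYERS = [38, 11, 11, 5]   # input, hidden, hidden, output

mlp1_params = count_parameters(MLP1_LAYERS)   # 429 + 60 = 489
mlp2_params = count_parameters(MLP2_LAYERS)   # 429 + 132 + 60 = 621

print(mlp1_params, mlp2_params, mlp2_params - mlp1_params)
```

Under these assumptions the second hidden layer adds 11 x 11 weights and 11 biases, that is 132 trainable parameters on top of the 489 of the single-hidden-layer network.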


3. SIMULATION EVALUATION AND RESULTS 

The MLP1 and MLP2 models are implemented by using MATLAB software on 64-bit Windows 8. The 
computer used has an Intel Core i7 CPU @ 2.10 GHz with 4 GB of RAM. The performance of the 
system is computed by analyzing the results on the testing dataset. 

The last 30% of the NSL-KDD dataset is used to test both proposed models. Firstly, the proposed 
MLP1 with the recorded weights is run on the testing dataset. The output classification results 
of the tested MLP1 are recorded and compared with the target (the request type feature). The 
confusion matrix is used to compute the classification accuracy of the MLP1 model, as illustrated 
in Figure 5. The MLP1 model achieves its best accuracy of 95.62%, calculated using (1) [25-31], 
after 100 epochs. 


$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FN + FP} \tag{1}$$


where TN, TP, FN, and FP denote the true negatives, true positives, false negatives, and false 
positives, respectively. 
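
The accuracy in (1) can be computed directly from the four counts, and its multi-class counterpart from a confusion matrix (correct predictions on the diagonal divided by all predictions). The Python sketch below is an illustration; the function names are assumptions, and the numbers in the usage lines are toy values, not results of this study.

```python
def binary_accuracy(tp, tn, fp, fn):
    """Accuracy as in (1): correct decisions over all decisions."""
    return (tp + tn) / (tp + tn + fp + fn)


def multiclass_accuracy(confusion):
    """confusion[i][j] = number of samples of true class i predicted as class j."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


# Toy counts chosen only to illustrate the arithmetic.
print(binary_accuracy(tp=40, tn=50, fp=6, fn=4))    # 90 correct out of 100
print(multiclass_accuracy([[5, 1], [2, 12]]))        # 17 correct out of 20
```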

Secondly, the MLP2 model with the recorded weights is run on the testing dataset. The outputs of 
the MLP2 are recorded and validated against the target. The system accuracy is calculated by 
using (1). Figure 6 shows the confusion matrix of the proposed system. The MLP2 achieved its best 
accuracy of 97.82% after 100 epochs. 
Figure 5. The confusion matrix and classification performance of the MLP1 

Figure 6. The confusion matrix and classification performance of the MLP2 


Generally, the proposed MLP2 model demonstrates an accuracy of 97.82%, which is 2.2 percentage 
points higher than that of the MLP1 model. Figure 7 shows the evaluation accuracy results for 
both the MLP1 and MLP2 models. 
Figure 7. The accuracy evaluation between the MLP1 and MLP2 for 100 epochs 


From the results, it can be argued that the additional hidden layer improved the learning 
capability compared with the single hidden layer. In addition, both the MLP1 and MLP2 failed to 
detect the Probing request type, of which there are 12 requests in the testing dataset. 
Therefore, further investigation of the selected features and the developed IDS models is 
required. 
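
A short calculation shows why overall accuracy barely reflects the missed Probing requests. The Python sketch below uses the dataset sizes given in Section 2.1 and the 12 Probing test requests stated above; it assumes that "failed to detect" means none of the 12 was classified correctly, and the names are introduced only for this example.

```python
N_TOTAL = 125973               # requests in the dataset (Section 2.1)
N_TRAIN = 88181                # training samples (Section 2.1)
N_TEST = N_TOTAL - N_TRAIN     # remaining samples used for testing
N_PROBING_TEST = 12            # Probing requests in the testing dataset


def recall(correctly_detected, class_total):
    """Fraction of a class's requests that the model identifies correctly."""
    return correctly_detected / class_total


# Assumption: no Probing request is classified correctly.
probing_recall = recall(0, N_PROBING_TEST)

# Overall accuracy lost if every Probing request is misclassified, in percent.
accuracy_cost = 100.0 * N_PROBING_TEST / N_TEST

print(f"Probing recall: {probing_recall:.2f}")
print(f"Accuracy cost of missing all Probing requests: {accuracy_cost:.3f}%")
```

Because the class is so small, missing all of it costs only about 0.03 percentage points of accuracy, so a per-class measure such as recall is needed to expose the failure.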


4. CONCLUSION 

In this article, two types of MLP are designed and modelled for IDS applications. The NSL-KDD 
dataset is used for both the training and testing stages. Only 38 of the 42 features are used for 
the classification stage. The same setup and parameters are used for both MLP models. The 
evaluation shows that the MLP2 model with two hidden layers achieves an accuracy of 97.82% while 
the MLP1 model achieves an accuracy of 95.62%. Consequently, it is observed that increasing the 
number of hidden layers may improve the learning capability of ANNs. Ultimately, the proposed 
MLP-based IDS is a powerful tool for security applications. Further investigation of feature 
selection is required. 